The Data Warehouse of

Authors

  • Himanshu Gupta
  • Divesh Srivastava
Abstract

Electronic newsgroups are one of the primary means for the dissemination, exchange and sharing of information. We argue that the current newsgroup model is unsatisfactory, especially when posted articles are relevant to multiple newsgroups. We demonstrate that considerable additional flexibility can be achieved by managing newsgroups in a data warehouse, where each article is a tuple of attribute-value pairs, and each newsgroup is a view on the set of all posted articles. Supporting this paradigm for a large set of newsgroups makes it imperative to efficiently support a very large number of views: this is the key difference between newsgroup data warehouses and conventional data warehouses. We identify two complementary problems concerning the design of such a newsgroup data warehouse. An important design decision that the system needs to make is which newsgroup views to eagerly maintain (i.e., materialize). We demonstrate the intractability of the general newsgroup-selection problem, consider various natural special cases of the problem, and present efficient exact/approximation algorithms and complexity hardness results for them. A second important task concerns the efficient incremental maintenance of the eagerly maintained newsgroups. The newsgroup-maintenance problem for our model of newsgroup definitions is a more general version of the classical point-location problem, and we design an I/O- and CPU-efficient algorithm for this problem.
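The model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class `NewsgroupWarehouse`, its method names, and the sample attributes (`topic`, `year`) are all hypothetical, and the maintenance step is a naive scan over every materialized view rather than the point-location-style algorithm the paper designs. It only shows the idea that an article is a tuple of attribute-value pairs, a newsgroup is a view (a predicate) over all articles, and some views are eagerly maintained while others are computed on demand.

```python
from typing import Callable, Dict, List

# An article is a tuple of attribute-value pairs; a dict stands in for it here.
Article = Dict[str, object]
# A newsgroup is a view: a predicate selecting articles (hypothetical encoding).
View = Callable[[Article], bool]


class NewsgroupWarehouse:
    """Toy warehouse; all names are illustrative assumptions."""

    def __init__(self) -> None:
        self.articles: List[Article] = []
        self.views: Dict[str, View] = {}
        self.materialized: Dict[str, List[Article]] = {}

    def define(self, name: str, predicate: View, eager: bool = False) -> None:
        """Register a newsgroup; eager=True materializes it immediately."""
        self.views[name] = predicate
        if eager:
            self.materialized[name] = [a for a in self.articles if predicate(a)]

    def post(self, article: Article) -> None:
        """Store the article and incrementally update materialized views.

        Naive maintenance: test the new article against every eager view.
        """
        self.articles.append(article)
        for name, stored in self.materialized.items():
            if self.views[name](article):
                stored.append(article)

    def read(self, name: str) -> List[Article]:
        """Return a newsgroup's articles, from storage or computed lazily."""
        if name in self.materialized:
            return list(self.materialized[name])
        return [a for a in self.articles if self.views[name](a)]


wh = NewsgroupWarehouse()
wh.define("db.warehousing", lambda a: a.get("topic") == "data warehouse", eager=True)
wh.define("recent", lambda a: a.get("year", 0) >= 1999)  # computed on demand
wh.post({"author": "alice", "topic": "data warehouse", "year": 1999})
wh.post({"author": "bob", "topic": "point location", "year": 1998})
```

In this sketch a single posted article can appear in several newsgroups without being cross-posted, since membership is decided by each view's predicate. The cost of `post` grows with the number of eager views, which is the kind of overhead that motivates both choosing which views to materialize and maintaining them efficiently.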

Similar resources

Improvement of the Analytical Queries Response Time in Real-Time Data Warehouse using Materialized Views Concatenation

A real-time data warehouse is a collection of recent and hierarchical data that is used for managers’ decision-making by creating online analytical queries. The volume of data collected from data sources and entered into the real-time data warehouse is constantly increasing. Moreover, as the volume of input data to the real time data warehouse increases, the interference between online loading ...

An Integrated Model for Assessing Organizational Readiness to Implement a Data Warehouse System Using the Analytic Hierarchy Process

An Enterprise Data Warehouse initiative is a high-investment project. The adoption of a Data Warehouse will differ significantly depending on an organization's level of readiness. Before implementing a Data Warehouse system in a firm, it is necessary to evaluate the firm's level of readiness. A successful Data Warehouse assessment model requires a deep understanding of opportuni...

Utility of Ranking Warehouse Candidates in Workshop Locations Using UTAStar

Although the importance of facility location in manufacturing and service companies is not a new issue, one significant application is determining the appropriate location for warehouses in manufacturing workshops for holding materials or products. In any organization, finding a suitable site for warehouse establishments to increase customer service and efficiency is one o...

Speeding Up Incremental View Maintenance Using the Cuckoo Algorithm

A data warehouse is a repository of integrated data collected from various sources. It can maintain data from those sources in the form of views, so the views must be maintained and updated when the sources change. Since a growing number of updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...

A Solution to View Management to Build a Data Warehouse

Several techniques exist to select and materialize a proper set of data in a suitable structure that manage the queries submitted to the online analytical processing systems. These techniques are called view management techniques, which consist of three research areas: 1) view selection to materialize, 2) query processing and rewriting using the materialized views, and 3) maintaining materializ...

The optimal warehouse capacity: A queuing-based fuzzy programming approach

Among the various existing models for warehousing management, the simultaneous use of private and public warehouses is the most well-known. The purpose of this article is to develop a queuing theory-based model for determining the optimal capacity of a private warehouse in order to minimize the total corresponding costs. In the proposed model, the available space and budget to create a...

Publication date: 1999